Performance Analysis of k-NN on High Dimensional Datasets
Similar resources
Performance Analysis of k-NN on High Dimensional Datasets
Classifying high-dimensional datasets remains an open research direction in pattern recognition. High-dimensional feature spaces cause scalability problems for machine learning algorithms because the complexity of a high-dimensional space increases exponentially with the number of features. Recently a number of ensemble techniques using different classifiers have been proposed for classifying ...
A high performance k-NN approach using binary neural networks
This paper evaluates a novel k-nearest neighbour (k-NN) classifier built from binary neural networks. The binary neural approach uses robust encoding to map standard ordinal, categorical and numeric data sets onto a binary neural network. The binary neural network uses high speed pattern matching to recall a candidate set of matching records, which are then processed by a conventional k-NN appr...
Predicting the Performance of Index Structures for High-Dimensional Datasets
We present two new models for predicting the number of index page accesses during nearest neighbor queries for high-dimensional datasets, a density-based model that depends weakly on the underlying index structure and results in coarser predictions, and a sampling-based model that depends strongly on the underlying index structure and gives more accurate predictions. We give detailed evaluation...
Sentiment Analysis of Review Datasets Using Naive Bayes and K-NN Classifier
The advent of Web 2.0 has led to an increase in the amount of sentimental content available on the Web. Such content is often found on social media web sites in the form of movie or product reviews, user comments, testimonials, messages in discussion forums, etc. Timely discovery of the sentimental or opinionated web content has a number of advantages, the most important of all being monetizatio...
Comparing MapReduce-Based k-NN Similarity Joins on Hadoop for High-Dimensional Data
Similarity joins are a useful operator for data mining, data analysis and data exploration applications. With the exponential growth of data to be analyzed, distributed approaches like MapReduce are required. So far, the state-of-the-art similarity join approaches based on MapReduce have mainly focused on the processing of low-dimensional vector data. In this paper, we revisit and investigate ...
Journal
Journal title: International Journal of Computer Applications
Year: 2011
ISSN: 0975-8887
DOI: 10.5120/1988-2678